Method of searching novel ligand compounds from three-dimensional structure database

ABSTRACT

A method of searching one or more ligand-candidate compounds to a target biopolymer from a three-dimensional structure database, which comprises the steps of: (i) the first step of assigning hydrogen-bonding category numbers, information for calculating force-field energy, and information for generating conformations to two or more trial compounds in addition to atomic three-dimensional coordinates thereof; (ii) the second step of preparing physicochemical information about a ligand-binding region and one or more dummy atoms based on the three-dimensional atomic coordinates of the target biopolymer; (iii) the third step of estimating the most stable docking structure, wherein said step further comprises the steps of examining possible docking structures by docking a trial compound to the biopolymer while varying conformations of the trial compound, evaluating interaction energies between the target biopolymer and the trial compound, and repeating structure optimization; (iv) the fourth step of deciding whether or not the trial compound should be adopted as a ligand-candidate compound based on given criteria including the interaction energy values between the target biopolymer and the trial compound in the most stable docking structure estimated according to the step (iii); and (v) the fifth step of repeating the step (iii) and the step (iv) for all of the trial compounds.

[0001] This application is a divisional of application Ser. No.10/022,275, filed Dec. 20, 2001, which is a continuation of U.S.application Ser. No. 08/817,646, filed Apr. 29, 1997, now U.S. Pat. No.6,389,378, both of which are hereby incorporated by reference in theirentirety, and which is a National Stage application of PCT/JP95/02219,filed Oct. 30, 1995, which was not published in English under PCTArticle 21(2). The present application claims priority under 35 U.S.C.§119 of Japanese Application No. 6-267688 filed Oct. 31, 1994.

TECHNICAL FIELD

[0002] The present invention relates to a method of searching athree-dimensional structure databases, which can be utilized for findingnovel lead-structures of medicaments, pesticides, and other biologicallyactive compounds.

BACKGROUND ART

[0003] Medicaments generally exhibit their biological activities throughstrong interactions with target biopolymers. In recent years,three-dimensional structures of biopolymers, which play important rolesin living bodies, have been revealed successively on the basis ofprogresses in X-ray crystallographic analyses and NMR technology. Withthe advance of these researches, methodologies of theoretical approachbased on information about three-dimensional structures of biopolymersand their successful results have been reported with respect to leadgenerations of new drugs, which were conventionally achieved mostly byrandom screenings or accidental discoveries. Importance of such approachis recognized with increasing interest, and researches from variousviewpoints have been conducted in all the world.

[0004] An object of these researches is to provide a process forcreation of ligand candidate structures by means of a computer. Anotherobject is to provide a process for searching for ligand structures froma three-dimensional structure database containing known compounds. Theformer (automatic structure construction approach) is advantageous inthat the process can suggest a wide variety of possible ligandstructures irrespective of known or unknown structures. On the otherhand, the latter (database approach) is used in searching for novelbiologically active compounds from databases comprised of accumulatedstructures of known compounds. When a search is applied to a databasewhich consists of compounds stocked in an own company or commerciallyavailable compounds, this method is particularly advantageous in thatcompounds, this method is particularly advantageous in that compoundsfound to reach criteria of hit (“hit compounds”) can be obtained withoutany synthetic efforts, and their association constants to targetbiopolymers and biological activities can be measured.

[0005] Synthesis of a compound with a novel molecular skeleton generallyneeds several months even for an expert, and accordingly, years of andburden of work are required to synthesize a lot of promising structuresand measuring their activities. However, if hit compounds are alreadyavailable, measurements of association constants and biologicalactivities can be easily performed for dozens of promising ligandcandidate compounds. In addition, based on the structure of a compoundfound to have a considerable extent of high association constant oractivity, an efficient lead generation is achievable by designingcompounds with higher association constants and biological activitiesand synthesizing the compounds. For these reasons, approaches haverecently been interested which comprises the steps of searchingthree-dimensional structure databases consisting of known compounds anddiscovering novel pharmacologically active compounds.

[0006] However, one of problems of this method is by what query thesearch is carried out. Generally, a three-dimensional structure databaseis searched by queries of whether or not given atomic groups orfunctional groups exist which are presumed as essential for the desiredactivity in typical biologically active compounds used as templates, oralternatively, whether or not their relative positions are similar tothose of the template compounds. Where the structure of a targetbiopolymer is unknown, basically no approach can be relied on other thanthis process. However, since these queries are based on assumptions andhypotheses, hit compounds may often fail to have the same activity asthat of the template compounds. Where two or more molecules with quitedifferent structural features exhibit similar activities, it can onlybecome possible to create more probable operating hypotheses, withreference to functional groups essential for the activity and theirrelative positions, by means of structural information about thesemolecules.

[0007] The most difficult problem in the search of three-dimensionalstructure databases lies in the handling of conformational flexibilityof compounds. It has been known that the conformations of a ligand as itbinds to a biopolymer (i.e., the most stable complex is formed) do notnecessarily accord with any of stable structures of the molecule itself,in the state of a crystal or a solution, or a structure with the moststable energy obtained by energy calculation, and that one ligandmolecule can form stable complexes in different conformations dependingon target biopolymers. Generally, a set of atomic coordinatesrepresenting a three-dimensional structure of one conformation amongmultiple possible conformations is contained as for one compound in adatabase. Furthermore, except for databases derived from crystalstructures, information of three-dimensional structure of each compoundis obtained from two-dimensionally inputted structure throughcalculation. These three-dimensional structures often represent one oflocal minimum structures that can be taken by the molecule, per se.

[0008] Therefore, if a search is carried out merely on the basis ofconformations contained in a database to judge whether or not compoundsmeet the criteria of hit, most compounds fail to be selected whichshould be hit if other conformations are considered. Although the numberof conformations to be considered may vary depending on the number ofrotatable bonds, as well as factors such as degrees of preciseconsideration of conformations, several tens to hundreds of thousandsconformations should be taken into account for a moderately flexiblemolecule containing 3 to 6 rotatable bonds. In order to consider thesepossible conformations, available processes are limited to either ofthose comprising the steps of selecting promising conformations andinputting them in a database beforehand, or alternatively, generatingconformations and examining at the time of conducting a search. In anyevents, enormous computer resources and calculation time are required.

[0009] Recently, Kearsley et al. with Merck prepared a database whichcomprises 20 conformations at most for a single compound and reported“FLOG,” a searching system in which a search is carried out usingsearching standard mainly consisting of relative positions of functionalgroups (M. D. Miller, S. K., Kearsley, D. J. Underwood, and R. P.Sheridan: Journal of Computer-Aided Molecular Design, 8, 1994, pp.153-174, FLOG: A system to select ‘quasi-flexible’ ligands complementaryto a receptor of known three-dimensional structure). It was reportedthat the FLOG took approximately one week to complete searching adatabase containing 2,000,000 conformations for 100,000 compounds bymeans of a super computer CRAY. The twenty conformations in average foreach compound are stored by selecting energetically stable local minimumstructures through prior conformational analysis of each compound, forwhich enormous time and burden of work are needed. Nevertheless, twentyconformations per a single compound are insufficient.

[0010] In addition, the method adopts the query which comprise thepresence or absence of functional groups presumably essential for theactivity of a compound as a template, as well as distance, angle, ordirection between the groups. These queries are most plain amongpossible queries, and although they have advantages to shortencalculation time using simple algorithm, high probability cannot beexpected that hit compounds actually have the desired activity. Thereason lies in that a molecule having inappropriate whole molecularshape and size fails to exhibit activity, even if functional groupsmerely have desired relative positions. If a three-dimensional structureof the target biopolymer is known, the most effective search can beachieved by means of such information, which provides high probabilitythat hit compounds have the desired activity.

[0011] For example, Eyermann et al. with Dupont Merck performed adatabase search based on relative positions of functional groups ofligand molecules which had been analyzed by X-ray crystallography as acomplex with a biopolymer (P. Y. S. Lam, P. K. Jadhav, C. J. Eyemann, C.N. Hodge, Y. Ru, L. T. Bacheler, J. L. Meek, M. J. Otto, M. M. Rayner,Y. N. Wong, C. H. Chang, P. C. Weber, D. A. Jackson, T. R. Sharp, and S.Erickson-Viitanen: Science, 263, 1994, pp.380-384, Rational design ofPotent, Bioavailable, Nonpeptide Cyclic Ureas as HIV ProteaseInhibitors.). In the HIV protease system, compounds were searched for asto criteria where the three positions of an oxygen atoms in watermolecules are in similar relative positions which connect the proteaseto the ligand through hydrogen bonds at the centers of two benzene ringsin the peptide ligand molecule. Among the hit compounds, one compoundwas selected which well fitted into the ligand-binding region and formedhydrogen bonds other than those formed by the oxygen atoms used asquery. They reported that a compound with high activity was developedbased on the structure of this compound by repeated works ofsynthesizing compounds with appropriate modifications while measuringinhibitory activities against the enzyme. However, the publication lacksdetailed descriptions as for the search process, and their technique ofhandling conformations is not deducible.

[0012] The search query capable of providing the highest probabilitythat hit compounds have desired activity is that whether or notcompounds stably bind to a ligand-binding region of the targetbiopolymer. Although in some cases the desired activity fails to beexhibited due to physicochemical factors such as water solubility evenif a compound satisfies the criteria, reaching the criteria is anessential requirement for the expression of activity. On the other hand,the presence or absence of specific functional groups and the relativepositions of functional groups are not necessarily essential for theexpression of activity, and even if a compound falls within a differentfamily and has dissimilar skeletal structure to the known activestructures, the compound may possibly exhibit the same activity if itcan form a complex between the target biopolymer at a similar extent ofstability. Therefore, compounds satisfying the criteria that they canstably bind to a ligand-binding region of the target biopolymer havehigh probability of being actual candidates as ligands. In other words,by using the criteria of fitting to a ligand-binding region of thetarget biopolymer, it becomes possible to identify compounds whichexhibit the desired biological activity and have a wide variety ofstructures from databases.

[0013] In order to determine whether or not a given ligand molecule canbind to the target biopolymer, it is necessary to find the most stabledocking structure among possible docking structures and to know thedegree of stability of the docking structure. Also, in order to find themost stable docking structure between a given biopolymer and a ligandmolecule, it is necessary to estimate stabilities of all dockingstructures by considering all possible binding modes (corresponding torotation or translation of one molecule while fixing another compound),and possible ligand conformations. However, since the process needsenormous calculation, it cannot be achieved by interactive methods usinga graphic display. Therefore, the development of an automatic dockingmethod has been desired which can perform the above processautomatically and efficiently.

[0014] As automatic docking methods, Kuntz et al. developed a method forestimating possible binding modes by representing the shape of aligand-binding region by means of several to dozens of inscribed spheresand comparing a set of vectors formed by the centers of the spheres witha set of intramolecular vectors of a ligand compound (R. L. Desjarlais,R. P. Sheridan, G. L. Seibel, J. S. Dixon, I. D. Kuntz and R.Venkataraghavan: J. Med. Chem. 31 (1989) 722-729). Using ShapeComplementarity as an Initial Screen in Designing Ligands for aReceptor-Binding Site of Known 3-Dimensional Structure.). In a recentimproved process, the process is modified so that intermolecular energycan be calculated in addition to the score representing coincidencebetween the vectors.

[0015] However, this method is disadvantageous because it needs an unduetime even only for covering all binding modes. Some improved processesinclude alternations so that conformations can be varied for the dockingof a single compound. However, the processes are not designed so as toprovide an automatic procedure covering all of the docking processes.Although energy evaluation can be done concerning hit compounds, onlysimple scores can be calculated and hydrogen bonds or the like aredisregarded during the docking process. Furthermore, its accuracy isinsufficient, i.e., there is a problem in that, for example, the truedocking structure determined by crystal analysis fails to be judged asthe highest rank and is sometimes judged as rather low in rank.Moreover, its unavoidable defect is in that the process is carried outfundamentally by a fixed conformation for each compound, andconformational flexibilities cannot be handled. In a manual operationcannot compensate for the above defect especially, its practical utilityis low in a three-dimensional database search which requires fullautomatic operation for numerous compounds.

[0016] The inventors of the present invention developed “ADAM,” a methodfor estimating the most stable docking structures of a biopolymer and aligand molecule automatically and without any preconception, and theysuccessfully solved all of the aforementioned problems (PCT/JP93/0365,PCT International Publication WO 93/20525; M. Yamada and A. Itai, Chem.Pharm. Bull., 41, p.1200, 1993; M. Yamada and A. Itai, Chem. Pharm.Bull., 41, p.1203, 1993; and M. Yamada Mizutaui, N. Tomioka and A. Itai,J. Mol. Biol., 243, pp.310-326, 1994).

[0017] In the applications of “ADAM” to several enzyme-inhibitor systemswhose structures had been elucidated by crystallographic analyses, allthe resulted docking models with minimum energy perfectly reproduced thestructure in the crystal, and torsion angles of rotatable bonds in theligand molecule and hydrogen-bonding schemes were also accuratelyreproduced. The high accuracy of “ADAM” is based on the calculations ofstructure optimization (energy minimization) performed altogether threetimes. Though the time required may vary depending on the performance ofa computer, numbers of dummy atoms, number of heteroatoms, number ofrotatable bonds and the like and cannot be generalized, it takes fromseveral minutes through approximately 1 hour to obtain an initialdocking model with an ordinary drug molecule by means of a commonly usedworkstation (R4400). This speed is dozens of times faster than thefastest method which handles possible conformations reported so far.

[0018] However, although ADAM is suitable in searching for the moststable docking structure between the target biopolymer and a singleligand molecule, the method is not suitable for searching novel ligandcompounds from a wide variety of numerous compounds contained in athree-dimensional structure database as it is. Therefore, furtherimprovement has been desired.

[0019] Accordingly, an object of the present invention is to provide amethod suitable for searching novel ligand compounds from a wide varietyof numerous compounds, thereby solving the problems of the state of theart.

DESCRIPTION OF THE INVENTION

[0020] The inventors of the present invention performed variousresearches. As a result, they succeeded in developing a process forselecting one or more promising ligand candidates from an enormousnumber of trial compounds: wherein

[0021] the process comprises the steps of estimating the most stabledocking structure for each trial compound; and selecting one or moreligand candidate compounds from all the trial compounds by givencriteria including the interaction energy between the target biopolymerand the trial compound in the most stable docking structures; and theaforementioned step of estimating the most stable docking structure foreach trial compound comprises the steps of:

[0022] evaluating all possible docking structures generated throughdocking of the trial compound to the biopolymer, while varyingconformation of the trial compound and repeating structuraloptimizations, based on interaction energies between the biopolymer andthe trial compound, e.g., hydrogen bonds, electrostatic interaction, andvan der Waals force. The present invention was achieved on the basis ofthese findings.

[0023] The present invention thus provides a method of searching one ormore ligand compound to a target biopolymer from a three-dimensionalstructure database, which comprises the steps of:

[0024] (i) the first step of assigning hydrogen-bonding categorynumbers, information for calculating force-field energies, andinformation for generating conformations, to two or more trialcompounds, in addition to three-dimensional atomic coordinates thereof;

[0025] (ii) the second step of preparing physicochemical informationabout a ligand-binding region and one or more dummy atoms based on thethree-dimensional atomic coordinates of the target biopolymer;

[0026] (iii) the third step of estimating the most stable dockingstructure, wherein said step further comprises the steps of generatingpossible docking structures by docking a trial compound to thebiopolymer while varying conformations of the trial compound, andevaluating interaction energy between the target biopolymer and thetrial compound, based on the hydrogen-bonding category number, theinformation for calculating force-field energies, and the informationfor generating conformations assigned to the trial compound in additionto the three-dimensional atomic coordinates according to the step (i)and based on the physicochemical information about the ligand-bindingregion prepared in the step (ii): and then;

[0027] (iv) the fourth step of deciding whether or not the trialcompound should be adopted as a ligand-candidate compound based on givencriteria including the interaction energy values between the targetbiopolymer and the trial compound in the most stable docking structurefound according to the step (iii); and

[0028] (v) the fifth step of repeating the step (iii) and the step (iv)for all of the trial compounds.

[0029] The method of the present invention may comprise the sixth stepof further selecting the ligand-candidate compounds selected in the step(iv) based on different criteria including the number of hydrogen bondsformed between the target biopolymer and/or interaction energy valuesderived therefrom.

BRIEF DESCRIPTION OF DRAWINGS

[0030]FIG. 1 shows a conceptual scheme by featuring the method of thepresent invention.

[0031]FIG. 2 is a flowchart according to an embodiment of the method ofthe present invention. In the figure, S represents each step.

[0032]FIG. 3 shows examples of the substrate and known inhibitoranalogues for a dihydrofolate reductase selected as ligand candidatecompounds by the method of the present invention.

[0033]FIG. 4 shows examples of ligand-candidate compounds withnovel-structures for a dihydrofolate reductase obtained by the method ofthe present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

[0034] In the method of the present invention, the trial compounds arenot particularly limited so long as their structures are known. Forexample, various compounds contained in currently available databasescan be used. The hydrogen-bonding category numbers are identificationnumbers for functional groups capable of forming hydrogen bonds, andassigned to heteroatoms that directly participate in hydrogen bonds bythe functional groups. By referring to the number, a geometric structureof the functional group and the nature of the hydrogen bond as well asthe position of partner hydrogen-bonding atoms (dummy atoms) areinstantly generated. The information for calculating force fieldenergies includes an atom type number to discriminate atomic element andthe state of hybridization of each atom that are used for calculatingintramolecular and intermolecular interaction energies by means ofmolecular force field. The information includes an atomic chargeassigned to each atom. The information for preparing conformations isused for systematically generating different conformations by varyingtorsion angles of rotatable bonds, in the range of the given initial andfinal values of the torsion angles by given values of step angle. Itincludes, for one bond to be rotated, a set of four atomic numbersconstituting the torsion angle, and the initial and final values of thetorsion angle and a step angle of rotation.

[0035] The term “biopolymer” means biological macromolecules andincludes artificial mimetic molecules based on biologicalmacromolecules. The term “physicochemical information about theligand-binding region” means potential influence brought by all of theatoms of the biopolymer inside a cavity space of the biopolymer to whicha ligand can bind. The meaning includes hydrogen-bonding properties ofthree-dimensional grid points within the ligand-binding region (hydrogenbond acceptor or hydrogen bond donor), and van der Waals interactionenergies and electrostatic interaction energies generated between thebiopolymer and probe atoms when the probe atoms are placed on thethree-dimensional grid points. The meaning of “docking” is used toinclude the step of forming a complex from a trial compound molecule anda biopolymer, including the step of finding a stable docking structureformed by the both of them. The docking can generally be performed byinteractive methods using computer graphics, as well as by automaticdocking methods as the case may be.

[0036] The term “interaction between a biopolymer and a trial compound”is a force arisen between the biopolymer and the trial compound, whoseexamples include hydrogen bond, electrostatic interaction, van der Waalsinteraction and the like. The term “ligand” or “ligand compound” is acompound with low molecular weight that binds to a biopolymer, whoseexamples include medicaments, substrates and inhibitors of enzymes,coenzymes, and other biologically-active substances.

[0037] According to an embodiment of the method of the presentinvention, the above third step may comprise the following steps:

[0038] (1) step (a) of searching likely binding modes between thebiopolymer and the trial compound by preparing all of the combinatorialcorrespondences between hydrogen-bonding heteroatoms in the trialcompound and dummy atoms placed at the positions of heteroatoms that canbe hydrogen-bonding partners of hydrogen-bonding functional groups inthe biopolymer;

[0039] (2) step (b) of simultaneously estimating possiblehydrogen-bonding schemes between the biopolymer and the trial compoundand conformations of hydrogen-bonding part of the trial compound bycomparing the distances between dummy atoms with the distances betweenthe corresponded hydrogen-bonding heteroatoms while systematicallychanging the conformations of the trial compound; and

[0040] (3) step (c) of obtaining the structure of a docking structurecomprising the biopolymer and the trial compound by converting theatomic coordinates of the trial compound to the coordinate system of thebiopolymer for each hydrogen-bonding scheme and conformation obtained inthe step (b) based on the correspondences between the hydrogen-bondingheteroatoms in the trial compound and the dummy atoms.

[0041] In addition, according to another embodiment of the method of thepresent invention, the above third step may comprise the followingsteps:

[0042] (1) step (a) of searching likely binding modes between thebiopolymer and the trial compound by preparing all of the combinatorialcorrespondences between hydrogen-bonding heteroatoms in a partialstructure of the trial compound and the dummy atoms placed at thepositions of heteroatoms that can be hydrogen-bonding partners ofhydrogen-bonding functional groups in the biopolymer;

[0043] (2) step (b) of simultaneously estimating hydrogen-bondingschemes between the biopolymer and the trial compound and conformationsof the partial structure of the trial compound by comparing thedistances between dummy atoms with the distances between thecorresponded hydrogen-bonding heteroatoms while systematically changingthe conformations of the trial compound;

[0044] (3) step (c) of preserving correspondences of the dummy atoms andthe hydrogen-bonding heteroatoms that provide impossiblehydrogen-bonding schemes in the partial structure of the trial compound,and hydrogen-bonding heteroatoms that cannot form any hydrogen-bond withthe dummy atoms based on the hydrogen-bonding schemes and conformationsobtained in the step (b);

[0045] (4) step (d) of searching likely binding modes between thebiopolymer and the trial compound by preparing all of the combinatorialcorrespondences between dummy atoms and hydrogen-bonding heteroatoms ina the whole structure of the trial compound, excluding thecorrespondences containing the hydrogen-bonding heteroatoms preserved inthe step (c) and the correspondences containing the dummy atoms andhydrogen-bonding heteroatoms preserved in the step (c);

[0046] (5) step (e) of simultaneously estimating possiblehydrogen-bonding schemes between the biopolymer and the trial compoundand conformations of hydrogen-bonding part of the trial compound bycomparing the distances between dummy atoms with the distances betweencorresponded hydrogen-bonding heteroatoms while systematically changingthe conformations of the trial compound; and

[0047] (6) step (f) of obtaining the structure of a docking structurecomprising the biopolymer and the trial compound by converting theatomic coordinates of the trial compound to the coordinate system of thebiopolymer for each hydrogen-bonding scheme and conformations obtainedin the step (e) based on the correspondences between thehydrogen-bonding heteroatoms in the trial compound and the dummy atoms.

[0048] When the third step includes the above sub-steps, it becomespossible to accelerate the search for stable docking structuresconsisting of the biopolymer and the trial compound, and in addition, italso become possible to search for stable docking structures includingthe most stable docking structure in short time even where thebiopolymer and/or the trial compounds have complicated structures.

[0049] According to a further embodiment of the method of the presentinvention, the above third step may comprise the following steps:

[0050] (1) step (a) of searching likely binding modes between thebiopolymer and the trial compound by preparing all of the combinatorialcorrespondences between hydrogen-bonding heteroatoms in the trialcompound and dummy atoms placed at the positions of heteroatoms that canbe hydrogen-bonding partners of hydrogen-bonding-functional groups inthe biopolymer;

[0051] (2) step (b) of simultaneously estimating possiblehydrogen-bonding schemes between the biopolymer and the trial compoundand conformations of hydrogen-bonding part of the trial compound bycomparing the distances between dummy atoms with the distances betweenhydrogen-bonding heteroatoms while systematically changing theconformations of the trial compound;

[0052] (3) step (c) of optimizing the conformations of the trialcompound so that the positions of the dummy atoms and thehydrogen-bonding heteroatoms in the trial compound can be in accord witheach other while retaining the hydrogen-bonding scheme obtained in thestep (b), and then excluding the conformations of the trial compoundhaving high intramolecular energies;

[0053] (4) step (d) of obtaining the structure of a complex comprisingthe biopolymer and the trial compound by converting the entire atomiccoordinates of the trial compound to the coordinate system of thebiopolymer for each conformation not excluded in the step (c) based onthe corresponding relationships between the hydrogen-bonding heteroatomsin the trial compound and the dummy atoms;

[0054] (5) step (e) of excluding from docking structures ofhydrogen-bonding parts obtained in the step (d) the docking structuresin which intramolecular energies of the hydrogen-bonding parts of thetrial compound and intermolecular energies between the biopolymer andthe hydrogen-bonding parts of the trial compound are high, and thencarrying out structure optimizations of the remaining dockingstructures;

[0055] (6) step (f) of obtaining new docking structures comprising thewhole trial compound by generating the conformations ofnon-hydrogen-bonding parts of the trial compound for each dockingstructure obtained in the step (e); and

[0056] (7) step (g) of excluding from the docking structures obtained inthe step (f) the docking structures in which intramolecular energies ofthe whole structure of the trial compound and intermolecular energiesbetween the biopolymer and the trial compound are high, and thencarrying out structure optimizations of the remaining dockingstructures.

[0057] When the third step includes the aforementioned sub-steps, itbecomes possible to select appropriate docking structures even whenreduced numbers of conformations are initially generated, and therebyobtain stable docking structures with high accuracy.

[0058] In the above methods, the hydrogen-bonding functional groupsinclude functional groups or atoms that can participate in theformations of hydrogen bonds. The term “hydrogen-bonding heteroatoms”means heteroatoms which exist in the hydrogen-bonding functional groupspresent in the trial compounds. The term “hydrogen-bonding part” means apartial structure of the trial compound structure which comprises allthe hydrogen-bonding heteroatoms that are corresponded to dummy atoms,and the term “non-hydrogen-bonding part” means a partial structure otherthan the hydrogen-bonding part.

[0059] According to a still further embodiment of the present invention,there is provided a method for preparing a three-dimensional structuredatabase which comprises the step of optionally adding three-dimensionalcoordinates of hydrogen atoms, and adding atom type numbers,hydrogen-bonding category numbers, atomic charge, and modes of bondrotation in addition to three-dimensional coordinates of trialcompounds.

[0060] Preferred embodiments of the present invention will be explainedbelow by referring to the flowchart shown in FIG. 2. In FIG. 2, thesymbol “S” represents each step and the name “ADAM” represents themethod for searching a stable docking structure of biopolymer and ligandmolecules disclosed in International Publication WO93/20525(International Publication Date: October, 14, 1993). The disclosure ofthe specification and claims of the International Publication WO93/20525is herein incorporated by reference.

[0061] The “ADAM” format three-dimensional database is first preparedwhich comprises at least two trial compounds. The aforementioneddatabase can be prepared, for example, by the method explained below.Three-dimensional coordinates of trial compounds are first inputted. Asthe three-dimensional coordinates of the trial compounds, coordinatesobtained by converting two-dimensional coordinates obtainable from atwo-dimensional database to a three-dimensional coordinates throughforce-field calculation, or coordinates obtained from structuralanalysis of a single crystal as well as those from crystal database, orthose from three-dimensional structures obtained by model buildingprocedure based on energy calculation may be used. As regardsthree-dimensional structures of each trial compound, conformations arebasically not necessary to be taken into account so long as geometricvalues such as bond distances and bond angles between the atoms areaccurate. However, in order to calculate atomic charge of each atom inthe trial compound as accurately as possible, it is preferred to inputatomic coordinates of a relatively stable structure.

[0062] After inputting the above three-dimensional coordinates of thetrial compounds, a three-dimensional database can be preparedautomatically by adding hydrogen atoms if necessary, and assigning aforce-field atom-type number to each atom, hydrogen-bonding categorynumbers to hydrogen-bonding heteroatoms, atomic charge to each atom, andmodes of bond rotations (for example, information about covalent bondsto be rotated, values of initial, final, and increment torsion angles ofthe rotation).

[0063] The force-field atom-type numbers for each atom in the trialcompound can be assigned, for example, according to the numbering methodof Weiner et al. (Weiner, S. P. A., Case, D. A., Singh, U. C., Ghio. C.,Alagona, G., Profeta, S., Jr., and . Am. Chem. Soc., 106, 1984,pp.765-784). The hydrogen-bonding ers for hydrogen-bonding heteroatomsin the trial compound can be rding to the following Table 1. The atomiccharge of each atom in the d can be calculated, for example, by theGasteiger method or by ital calculation with MNDO, AM1 or other methodin the MOPAC for the mode of bond rotation, it is preferred to indicateto rotate all s systematically with rotation-angle intervals of 60° to120°. TABLE 1 A List of Hydrogen-Bonding Hereroatoms and theirHydrogen-Bonding Category Numbers included in Hydrogen-BondingFunctional Groups Hydrogen-Bonding Heteroatoms included inHydrogen-Bonding Category Number Functional Groups 1 sp² N in PrimaryAmines 2 sp³ N in Primary Amines 3 sp³ N in Ammonium Ions 4 sp² N inAmides 5 sp³ N in Secondary Amines 6 N in Aromatic Groups 7 Protonated Nin Aromatic Groups 8 N in Tertiary Amines 9 Protonated N in TertiaryAmines 10 O in Hydroxyls with Rotatable C—O bond 11 O in Ethers 12 O inCarbonyls 13 O in Carboxylate Anions 14 O in Carboxylic Acids 15 O inPhosphates 16 O in Water Molecules 17 S in Merc{dot over (a)}ptos 18 Sin Thioethers 19 O in Hydroxyls with fixed hydrogen atom positions

[0064] Then, the number of atoms in the biopolymer and their atomiccoordinates (except for atomic coordinates of hydrogen atoms) areinputted (S2). As regards the atomic coordinates of the biopolymer,three-dimensional information obtained by X-ray crystallographicanalyses or NMR analyses, information obtainable from the proteincrystal structure database and the like, or alternatively, atomiccoordinates of a biopolymer model constructed on the basis of suchinformation or the like may be used. The atomic coordinates of thebiopolymer are preferably provided as a three-dimensional coordinatesystem. In addition, cofactors that bind to the biopolymer and thatstructurally as well as functionally play important roles may beconsidered as a part of the biopolymer, and the atomic coordinates ofthe biopolymer in the state of being bound with the cofactor may beinputted to perform the successive steps. Examples of the cofactorsinclude coenzymes, water molecules, metal ions and the like.

[0065] Following the above step, the atomic coordinates of the hydrogenatoms in amino acid residues present in the biopolymer are calculated(S3). In general, it is difficult to determine the positions of hydrogenatoms in biopolymers by experimental procedures such as X-raycrystallographic analyses or NMR analyses. In addition, the informationabout hydrogen atoms is not available from protein crystal structuredatabases or the like. Therefore, it is necessary to calculate theatomic coordinates of the hydrogen atoms in amino acid residues on thebasis of the structures of the amino acid residues present in thebiopolymer. For hydrogen atoms whose atomic coordinates cannot beuniquely determined as they bind to rotatable group of atoms in theamino acid residues, their atomic coordinates are preferably calculatedon the assumption that they exist at transpositions.

[0066] Subsequently, atomic charges are appended to the atoms in aminoacid residues present in the biopolymer (S4), and then hydrogen-bondingcategory numbers are assigned to the heteroatoms in hydrogen-bondingfunctional groups present in the biopolymer (S5). As the values of theatomic charges, reported values calculated for each amino acid, forexample, values reported by Weiner et al. can be used (Weiner, S. J.,Kollman, P.A., Case, D. A., Singh, U. C., Ghio, C., Alagona, G.,Profeta, S., Jr. and Weiner, P., J. Am. Chem. Soc., 106, 1984, pp.765-784). The hydrogen-bonding category numbers can be assignedaccording to the Table 1.

[0067] A ligand-binding region is then specified (S6). As theligand-binding region, spaces containing any part of the biopolymer canbe specified by using preferably a rectangular box. Depending on thepurposes, a ligand-binding pocket and its surrounding region of thebiopolymer can be specified. If desired, a region containing any site inthe biopolymer which binds a different molecule such as an effector andthe like can also be specified. The term “ligand-binding region” means acavity on the concaved surface of the biopolymer to which a ligandmolecule such as a substrate or an inhibitor bind. For assigning therange of the ligand-binding region, a part of the functions of theprogram “GREEN” can be used (Tomioka, N., Itai, A. and Iitaka, Y.,Journal of Computer-Aided Molecular Design, vol. 1, 1987, pp.197-210).

[0068] Three-dimensional grid points are generated inside theligand-binding region specified in the step S6, and for eachthree-dimensional grid point, an address number is assigned and gridpoint information is calculated (S7). The term “three-dimensional gridpoints” means the points on the three-dimensional lattice generated at acertain regular intervals within the ligand-binding region in thebiopolymer. The meaning of “grid point information” includes localphysicochemical character at each grid point in the ligand-bindingregion such as van der Waals interaction energies and electrostaticinteraction energies arisen between the biopolymer and probe atoms, aswell as hydrogen-binding properties. Wherein the van der Waals andelectrostatic interaction energies are calculated on the assumption thatprobe atoms exist on each three-dimensional grid point. By using thegrid point information on the three-dimensional grid points, it becomespossible to accelerate approximate calculations of the intermolecularinteractions between the biopolymer and the trial compound to be carriedout in the successive steps. In addition, reasonable determination ofthe positions of the dummy atoms, which is provided in the followingsteps, becomes achievable. As a result, possible docking structurescomprising the biopolymer and the trial compound can be entirelysearched in a short time.

[0069] The three-dimensional grid points may be generated at intervalsof 0.3-2.0 angstroms, preferably 0.3-0.5 angstroms in the regionspecified in the step S6. As the probe atoms, all sorts of atomscontained in possible ligand compounds are preferably included. The vander Waals interaction energy that acts between each atom placed on thethree-dimensional grid points and the biopolymer can be calculatedaccording to a conventional atom-pair type calculation using anempirical potential function. As the empirical potential function, theLennard-Jones type function represented by the following equation can beused.$G_{{v\quad d\quad w},i} = {\sum\limits_{j}( {{A/r_{ij}^{12}} - {B/r_{ij}^{6}}} )}$

[0070] In the formula, symbol “i” is a sequential number representing aprobe atom; and symbol “j” is a sequential number for the biopolymeratoms. Symbols “A” and “B” are parameters for determining the locationand the magnitude of minimum potential. Symbol “r_(ij)” is aninter-atomic distance between the i-th probe atom and the j-th atom inthe biopolymer. As the parameters “A” and “B,” the values reported byWeiner et al. can be used (Weiner, S. J. Kollman, P. A., Case, D. A.,Singh, U. C., Ghio, C. Alagona, G., Profeta, S., Jr. and Weiner, P., J.Am. Chem. Soc., 106, 1984, pp.765-784).

[0071] The van der Waals interaction energy between the trial compoundand the biopolymer, when the trial compound is placed on thethree-dimensional grid points, can be calculated by the followingequation:$E_{v\quad d\quad w} = {\sum\limits_{m = 1}^{N}G_{{v\quad d\quad w},m_{0}}}$

[0072] wherein “m” is the sequential number for each atom in the trialcompound; N is the total number of atoms in the trial compound; and “m₀”is the sequential number of the three-dimensional grid point that isclosest to the m-th atom.

[0073] The electrostatic interaction energy that generates between eachprobe atom placed on each three-dimensional grid point and thebiopolymer can be calculated by the following equation:$G_{{e\quad l\quad c},i} = {\sum\limits_{j}{K{{\overset{\_}{\quad q}}_{j}/ɛ}\quad r_{ij}}}$

[0074] wherein symbols “i,” “j,” and “r_(ij)” are the same as thosedefined above; “q_(j)” is the atomic charge of the j-th atom in thebiopolymer; “K” is a constant for converting an energy unit; and “ε” isa dielectric constant. As the dielectric constant, a fixed value may beused, or alternatively, it is preferably to use a value depending onr_(ij), as proposed by Warshel et al. (Warshel, A., J. Phys. Chem., 83,1979, pp.1640-1652).

[0075] The electrostatic interaction energy between the trial compoundand the biopolymer, when the trial compound is placed on thethree-dimensional grid points, can be calculated by the followingequation:$E_{e\quad l\quad c} = {\sum\limits_{m = 1}^{N}{q_{m}G_{{e\quad l\quad c},m_{0}}}}$

[0076] wherein m, N and m₀ are as defined above.

[0077] The term “hydrogen-bonding grid point information” meansinformation indicating that either hydrogen-donor or hydrogen-acceptoratom can form a hydrogen bond with the hydrogen-bonding functional groupin the biopolymer if it is placed on the three-dimensional grid point,or alternatively that both hydrogen-donor and hydrogen-acceptor atom can(or cannot) form a hydrogen bond with the hydrogen-bonding functionalgroup in the biopolymer if it is placed on the three-dimensional gridpoint. For example, the hydrogen-bonding information ofthree-dimensional grid points can be expressed by assigning number 1 fora three-dimensional grid point as a hydrogen-bond acceptor, number 2 fora three-dimensional grid point as a hydrogen-bond donor, number 3 on athree-dimensional grid having both properties, and number 0 on athree-dimensional grid having no hydrogen-bonding property.

[0078] The hydrogen-bonding grid point information can be obtained asset forth below. If the distance “DP” between a three-dimensional gridpoint “P” and a hydrogen-donating atom “D” in the biopolymer is within arange (e.g., 2.5-3.1 Å) that allows for the formation of a hydrogenbond, and if the angle ∠DHP made by the three-dimensional grid point“P”, hydrogen “H,” and atom “D” is within a range (e.g., more than 30°)that allows for the formation of a hydrogen bond, the three-dimensionalgrid point is determined to have property of hydrogen-bond acceptor.Similarly, if the distance “PA” between three-dimensional grid point “P”and hydrogen-accepting atom “A” in the biopolymer is within a range thatallows for the formation of a hydrogen bond and if the angle ∠ALP whichis made by the three-dimensional grid point “P,” a lone-pair electron“L,” and the hydrogen-accepting atom “A” is within a range that allowsfor the formation of a hydrogen bond, the three-dimensional grid pointis determined to have the property of hydrogen-bond acceptor. If athree-dimensional grid point is proved to have neither the property ofhydrogen-bond donor nor hydrogen-bond acceptor, the point is assumed tohave no hydrogen-bonding property.

[0079] Hydrogen-bonding functional groups that are expected to formhydrogen bonds with the trial compound are selected fromhydrogen-bonding functional groups present in the region of thebiopolymer specified in the step S6 (S8). If a lot of hydrogen-bondingfunctional groups are present, they can be selected depending on thedegree of importance. Furthermore, one or more dummy atoms are set toeach hydrogen-bonding functional group selected in step S8 based on thethree-dimensional grid point information calculated in S7 (S9).

[0080] In this step, a region in which a hydrogen-bond can be formedwith each hydrogen-bonding functional group selected in step S8(hydrogen-bonding region) is first determined based on thehydrogen-bonding grid point information calculated in step S7, and thena dummy atom is placed at the center of the hydrogen-bonding regioncomprising appropriate numbers of three-dimensional grid points, e.g.,5-20. The dummy atom is positioned within the hydrogen-bonding regionand outside the van der Waals radii of other atoms. The hydrogen-bondingregion is defined as a group of three-dimensional grid points that areneighboring to each other and have the same hydrogen-bonding properties.It should be noted that two or more dummy atoms may sometimes be placedfrom a single hydrogen-bonding functional group, or in contrast, nodummy atom may be generated. If the number of generated dummy atoms islarge, it is preferred to reduce the number to 10 or less, morepreferably 5 to 10, by selecting dummy atoms present in a bottom of thecavity of the ligand-bonding region considering the importance. To eachdummy atom, the same hydrogen-bond property is assigned as that of thehydrogen-bonding region to which the dummy atom belongs.

[0081] Then, a trial compound is selected (S10) and thenthree-dimensional coordinates, hydrogen-bonding category numbers,information for molecular force-field calculation, and information forconformation generation are read from the “ADAM” format databaseprepared in S1 (S11). Then, correspondences are made combinatoriallybetween the dummy atoms and the hydrogen-bonding heteroatoms in thetrial compound (S12). If the number of dummy atoms is “m” and the-numberof hydrogen-bonding heteroatoms in the trial compound is “n”, the numberof the correspondences (combination) represented by N(i), in whichnumber of hydrogen-bonds formed is “i”, is mPi×nCi wherein symbol “P”represents permutations and symbol “C” represents combinations. It ispreferred to generate all of the possible combinations of the dummyatoms and the hydrogen-binding heteroatoms in the trial compound.

[0082] Then, for all i's that satisfy the relationshipl_(min)≦i≦l_(max), the steps S12-S39 are repeated. As a result, a totalof $\sum\limits_{i = I_{\max}}^{I_{\max}}\quad {N(i)}$

[0083] sets of correspondences of dummy atoms and hydrogen-bondingheteroatoms in the trial compound are generated. By these procedures,all possible combinations of the hydrogen-bonds formed between thebiopolymer and the trial compound are generated, and thereby all thebinding modes between the trial compound and the biopolymer can besearched systematically and efficiently. It is preferred to set l_(max)as the smaller number of the dummy atoms or the hydrogen-bondingheteroatoms, and set l_(min) as the number smaller than l_(max) by 2-4.

[0084] More specifically, one of the correspondences of the dummy atomsand the hydrogen-binding heteroatoms generated in S12 is selected (S13).In this step, correspondences in which the hydrogen-bonding propertiesof the dummy atoms do not match those of the hydrogen-bondingheteroatoms in the trial compound are not selected. Then, interatomicdistances are calculated for all pairs of dummy atoms contained in thecorrespondence selected in S13 (S14). If both the number of dummy atomsand the number of hydrogen-bonding heteroatoms in the trial compound are1, the process jumps to step S22 skipping steps S14-S21 and then jumpsto step S26 skipping steps S23-S25. If either of the number of dummyatoms or the number of hydrogen-bonding heteroatoms in the trialcompound is 1, the process jumps to step S22 skipping steps S14-S21.

[0085] The trial compound molecule is divided into a hydrogen-bondingpart and a non-hydrogen-bonding part (S15), and rotatable bonds andtheir rotation modes in the hydrogen-bonding part divided in step S15are selected (S16). Then, conformations of the trial compound aregenerated successively by rotating the bonds selected in step S16according to the rotation modes inputted in the three-dimensional “ADAM”format database prepared in S1 (S17). For each generated conformation,subsequent steps S18-S29 are performed.

[0086] These steps will be detailed below. For each conformationgenerated in step S17, interatomic distances are calculated for allpairs of hydrogen-bonding heteroatoms in the trial compound contained inthe correspondence selected in step S13 (S18). The interatomic distancesof hydrogen-bonding heteroatoms are compared with those of correspondeddummy atoms. If the value of “F”, which is a sum of the squares of thedifferences between the dummy-atom/dummy-atom distances as calculated instep S14 and the heteroatom/heteroatom distances of the trial compoundas calculated in step 18, is outside a given range, the correspondenceis excluded as inappropriate in the conformation (S19). That is, when Fis calculated by the following equation:$F = {\sum\limits_{i = 1}^{{n{({n - 1})}}/2}( {r_{l\quad i} - r_{d\quad i}} )^{2}}$

[0087] hydrogen-bonding correspondences and conformations of the trialcompound which do not satisfy the following condition are excluded:

k _(a) ≦F≦k _(a)′

[0088] wherein “n” is the number of hydrogen bonds; “r_(di)” is the i-thdistance between dummy atoms; and “r_(li)” is the i-th distance betweenhydrogen-bonding heteroatoms in the trial compound; “k_(a)” is the lowerlimit of F; and “k_(a)′” is the upper limit of F. The value of k_(a) ispreferably 0.6-0.8 and that of k_(a)′ is preferably 1.2-1.4. In thisstep, possible hydrogen-bonding schemes between the biopolymer and thetrial compounds and possible conformations of the ligand molecule can besearched efficiently and thoroughly.

[0089] Then, for the hydrogen-bonding correspondences and conformationsnot excluded in step S19, the conformation of the hydrogen-bonding partof the trial compound is optimized by minimizing the value of F (S20).With this step, the conformation of the trial compound is corrected sothat the biopolymer and the trial compound can form stable hydrogenbonds. The conformation of the hydrogen-bonding part of the trialcompound can be optimized by minimizing the value of F using theFletcher-Powell method (Fletcher, R., Powell, M. J. D., Computer J., 6,1968, p.163), treating the torsion angles of the rotatable bonds presentin the hydrogen-bonding part as variables.

[0090] Then, the intramolecular energy is calculated for thehydrogen-bonding part of the trial compound optimized in step S20, andthe conformations with intramolecular energies higher than giventhreshold are excluded (S21). For example, when the intramolecularenergy of the hydrogen-bonding part of the trial compound is calculatedby using the molecular force field of AMBER 4.0, conformations with anintramolecular energy of more than 100 kcal/mol are preferably excluded.Then, the atomic coordinates of the trial compound is converted to thecoordinate system of the biopolymer so that the coordinates of thehydrogen-bonding heteroatoms in the trial compound in the conformationobtained in step S21 coincide with those of the corresponded dummy atoms(S22). The Kabsh's method of least squares can be used for this step(Kabsh, W., Acta Cryst., A32, 1976, p.922; Kabsh W., Acta Cryst., A34,1978, p.827). By these procedures, likely hydrogen-bonding schemes andthe conformations of the hydrogen-bonding part of the trial compound canbe estimated roughly at the same time.

[0091] Following the above step, the intermolecular interaction energybetween the biopolymer and the hydrogen-bonding part of the trialcompound (the sum of a van der Waals interaction energy and anelectrostatic interaction energy) and the intramolecular energy of thehydrogen-bonding part of the trial compound are calculated (S23). Theintermolecular interaction energy between the biopolymer and thehydrogen-bonding part of the trial compound E_(inter) can be calculatedby the following equation:$E_{i\quad n\quad t\quad e\quad r} = {\sum\limits_{k}\lbrack {{G_{v\quad d\quad w}( k_{0} )} + {{G_{e\quad l\quad c}( k_{0} )} \cdot q_{k}}} \rbrack}$

[0092] wherein “k” is a sequential number for the atoms in the trialcompound; “G_(vdw)(k₀)” and “G_(elc)(k₀)” are the van der Waals and theelectrostatic interaction energies of the grid point information at thegrid point at k₀ closest to the atom “k”; “qk” is the atomic charge ofthe atom k.

[0093] The intramolecular energy of the hydrogen-bonding part of thetrial compound E_(intra,) can be calculated by known methods. Forexample, E_(intra) can be calculated by the following equation using areported force field such as AMBER4.0:$E_{i\quad n\quad t\quad r\quad a} = {{\sum\limits_{d\quad i\quad h\quad e\quad d\quad r\quad a\quad l\quad s}{\sum\limits_{n}{\frac{V_{n}}{2}\lbrack {1 + {\cos ( {{n\quad \Phi} - \gamma} )}} \rbrack}}} + {\sum\limits_{i < j}( {{A_{ij}/R_{ij}^{12}} - {B_{ij}R_{ij}^{6}}} )} + {\sum\limits_{i < j}( {q_{i}{q_{j}/ɛ}\quad R_{ij}} )}}$

[0094] wherein “V_(a)” is a constant provided for force-field atom typesof four atoms constituting a torsion angle; “n” is a constantrepresenting the symmetry of the torsion-angle potential; Φ is a torsionangle, γ is the phase of the torsion-angle potential (provided forforce-field atom types of four atoms constituting a torsion angle);“A_(ij)” and “B_(ij)” are constants for a pair of force-field atom typesof the i-th and j-th atoms in the trial compound; “R_(ij)” is thedistance between the i-th and j-th atoms; q_(i) is the atomic charge onthe i-th atom in the trial compound; q_(j) is the atomic charge on thej-th atom in the trial compound, and ε is a dielectric constant.

[0095] Conformations of the trial compound are excluded if the sum ofthe intermolecular interaction energy and intramolecular energy ascalculated in step S23 is higher than a given threshold (S24). Forexample, conformations in which the sum of these energies is higher than10 kcal/mol per 100 dalton of molecular weight can be excluded. Then,the structure of the hydrogen-bonding part of the trial compound isoptimized if it was not excluded in step S24 (S25). The structures ofthe hydrogen-bonding part of the trial compound can be optimized byoptimizing the torsion angles of the hydrogen-bonding part of the trialcompound as well as the relative location and orientation of the trialcompound.

[0096] For example, the structure of the hydrogen-bonding part of thetrial compound can be optimized by the Powell method (Press, W. H.,Flannery, B. P., Teukolsky, S. A., Vitterling, W. T., “Numerical Recipesin C”, Cambridge University Press, Cambridge, 1989) or the like, so thatthe total energy E_(total) calculated by the following formula isminimized:

E _(total) =E _(inter) +E _(intra) +W _(hb) ·N _(hb) ·C _(hb)

[0097] wherein E_(inter) and E_(intra) are the same as those definedabove; “W_(hb)” is a weight; “N_(hb)” is the number of hydrogen-bonds;and “C_(hb)” is a constant energy value assigned for one hydrogen-bond(e.g., 2.5 kcal/mol).

[0098] Then, conformations of the trial compounds are successivelygenerated by rotating the rotatable bonds in the non-hydrogen-bondingpart of the trial compound according to the rotation mode inputted inthe “ADAM” format three-dimensional database prepared in S1 (S26), andsteps S27-S30 are repeated for each generated conformation.Specifically, the intermolecular interaction energy between thebiopolymer and the trial compound and the intramolecular energy of thetrial compound are calculated (S27), and conformations of the trialcompound are excluded if the sum of the intermolecular interactionenergy and intramolecular energy as calculated in S27 is higher than agiven threshold (S28). The entire structure of the trial compound havingconformations not excluded in step S28 are then optimized (S29), and theenergies are calculated for docking structures of the biopolymer and thetrial compound obtained by S29, and the most stable docking structurewith the minimum energy is selected (S30).

[0099] The intermolecular interaction energy and intramolecular energycan be calculated in the same manner as S23, provided that energiesshould be calculated for the whole structure of the trial compound inS27, whereas energies are calculated only for the hydrogen-bonding partof the trial compound in S23. As the upper limit for the sum of theseenergies, for example 10 kcal/mol per 100 dalton of molecular weight canbe used. Through this step, stable docking structures comprising thebiopolymer and the trial compound as well as active conformations of thetrial compound can be obtained. The optimization of the structures ofthe trial compound can be carried out in the same manner as S25,provided that the optimization should be performed for the wholestructure of the trial compound in S29, whereas the optimization isdirected only to the hydrogen-bonding part of the trial compound in S25.

[0100] The docking process of the biopolymer and the trial compound asdescribed above can be carried out by using the automated dockingprogram “ADAM” (PCT International Publication WO93/20525; Yamada, M. andItai, A., Chem. Pharm. Bull., 41, 1993, p.1200; Yamada, M. and Itai, A.,Chem. Pharm. Bull., 41, 1993, p.1203; Yamada Mizutani, M., Tomioka, N.,and Itai, A., J. Mol. Biol., 243, 1994, pp.310-326).

[0101] Whether or not the trial compound should be selected as aligand-candidate compound is decided based on all or part of the givencriteria including the intermolecular interaction energy between thebiopolymer and the trial compound, the intramolecular energy of thetrial compound, number of hydrogen bonds, number of rings and others,which is evaluated in the most stable docking structure obtained in S30(S31). Examples of the intermolecular interaction energy includeelectrostatic interaction energy, van der Waals interaction, hydrogenbond energy, and sum of them. For example, a trial compound with sum ofthe electrostatic and van der Waals interaction energies below −2kcal/mol per molecular weight of 100 dalton can be selected as aligand-candidate compound. Then, judgement is performed whether or notall of the trial compounds have been subjected to the selection (S32),and if not all of the trial compounds have been subjected to theselection, the steps of S10-S31 are repeated by returning to the stepS10, and if the selection has been completed, the process proceeds tothe step of S33.

[0102] More promising ligand-candidate compounds are further selectedfrom the ligand-candidate compounds selected in S31 (S33). Prior to thefurther selection, it is convenient to decide another criteria for thefurther selection by referring to a list of information about theligand-candidate compounds in their most stable docking structures,e.g., the intermolecular interaction energy between the biopolymer andthe trial compound, the intramolecular energy of the trial compound, thenumber of hydrogen bonds, and the number of rings and others. Forexample, a further selection is preferably carried out using more severecriteria comprised of the number of hydrogen bonds that are formedbetween the biopolymer and the ligand-candidate compound, theintermolecular interaction energy between the biopolymer and theligand-candidate compound, and the number of atoms.

[0103] Examples of the intermolecular interaction energy includeelectrostatic interaction energy, van der Waals interaction, and sum ofthem. For example, trial compounds with the sum of the electrostatic andvan der Waals interaction energies below −8 kcal/mol per molecularweight of 100 dalton are preferably selected as ligand-candidatecompounds. Although arbitrary number can be specified as a criterion ofthe number of hydrogen bonds depending on the sort of the biopolymer, itis desirable to apply a number of, for example, 2 or more, morepreferably 3 or more as the criterion. Criterion about the number ofatoms may also be arbitrary, but the number of, for example, 20 or more,preferably in the range of 20 to 40 can be specified as the criterion.

[0104] If too many dummy atoms are generated from the biopolymer, or ifthe trial compound has rather high conformational flexibility or has alot of heteroatoms, according to a more preferred embodiment of thepresent invention, information about the trial compound is inputtedaccording to the step S9, and then steps S13-S25 are carried out for apartial structure of the trial compound and preserve information aboutcorrespondences of dummy atoms and hydrogen-bonding heteroatoms of thetrial compound that provide impossible hydrogen-bonding schemes andhydrogen-bonding heteroatoms that cannot be corresponded with any dummyatom in the partial structure. Although the partial structure of thetrial compound can be arbitrarily specified without any structuralrestrictions, a partial structure containing at least threehydrogen-bonding functional groups is preferred. Presence or absence ofconformational flexibility in the partial structure may be disregarded.

[0105] Subsequent to the above step, the steps of S13-S33 are performedfor the whole structure of the trial compound, excluding, from the setof correspondences prepared in the step S12, the correspondencescontaining the hydrogen-bonding heteroatoms and the correspondences ofdummy atoms and hydrogen-bonding heteroatoms preserved in the previousstep. As a result, the number of the correspondences and conformationsto be examined can be reduced, thereby time for structural searching forstable docking structure of the biopolymer and the trial compound can begreatly shortened (this method is sometimes referred to as the“Pre-Pruning method (PP method)”). The PP method is based on theassumption that any hydrogen-bonding schemes and ligand conformationsthat are impossible in the partial structure of the trial compound arealso impossible in the whole structure of the trial compound. Byapplying the PP method, the time for the search can be remarkablyshortened without affecting the accuracy and reliability of the resulteddocking structure of the biopolymer and the trial compound.

EXAMPLE

[0106] In order to verify the effectiveness of the method of the presentinvention, calculations were carried out using a dihydrofolate reductaseas the target biopolymer as to whether or not the substrates or knowninhibitors, or analogues thereof can be selected as ligand-candidatecompounds.

[0107] As the atomic coordinates of the dihydrofolate reductase(hereinafter referred to as “DHFR”), the values obtained from the atomiccoordinates of the binary complex of E. coli DHFR and methotrexateavailable from the Protein Data Back entry 4DFR were used. The atomiccoordinates of a cofactor NADP⁺ molecule were added to the atomiccoordinates of the binary complex on the basis of the crystal structureof the ternary complex of DHFR, folic acid, and NADP⁺. All watermolecules were eliminated from the atomic coordinates of the binarycomplex, except for the two water molecules that strongly bind to DHFRand cannot be chemically removed. One of the two water molecules noteliminated binds to tryptophan 21 and aspartic acid 26 of DHFR throughhydrogen-bonds, and the other binds to leucine 114 and threonine 116.

[0108] As a database of trial compounds, the Available ChemicalsDirectory (ACD-3D, MDL) containing about 110,000 molecule data was used.Among the compounds stored in the database, the following molecules wereexcluded: molecules with 5 or less atoms; molecules consist of only H,C, Si and halogen atoms; molecules containing atoms other than H, C, N,O, S, P and halogen atoms; molecules with 6 or more rotatable bonds; andmolecules having no ring structure, and 10,017 compounds were subjectedto the search. The data were supplemented with hydrogen atoms, andforce-field atom-type numbers, hydrogen-bonding category numbers, atomiccharge, and bond-rotation modes (rotation with a rotation angle of 120°)were assigned. The criteria for the selection of ligand-candidatecompounds were as follows: the intermolecular interaction energy betweenthe target biopolymer (the enzyme) of −7 kcal/mol or less per molecularweight of 100 dalton, 3 or more intermolecular hydrogen bonds, and 25 ormore number of atoms.

[0109] According to the method of the present invention, about 1400trial compounds were finally selected as ligand-candidate compounds, andthe process required about 80 hours from inputting of three-dimensionalcoordinates of the biopolymer until the completion of the selection ofthe ligand-candidate compounds. These ligand-candidate compoundsincluded the compounds shown in FIG. 3, together with other numerousanalogues to substrates and known inhibitors. An inhibitor,methotrexate, and the substrate, dihydrofolate, were not selected fromthis search because of the limitation about the number of rotatablebonds, while trimethoprim was selected as a ligand candidate compound.The binding modes and the conformations of the ligand candidatecompounds selected by the method of the present invention agreed wellwith the binding modes and the conformations deduced from the crystalstructures of the complexes comprising the substrate or the inhibitoranalogue together with the enzyme.

[0110] According to the method of the present invention, numerousligand-candidate compounds with novel structures quite dissimilar to thesubstrate and the known inhibitors were selected. A part of examplesthereof are shown in FIG. 4 together with the estimated dockingstructures. In the figure, the bold line indicates the molecularskeletons of the selected ligand-candidates, the dotted line indicateshydrogen bonds, the fine solid line indicates a part of the biopolymer,and the cage depicted with the broken line indicates regions allowed forthe centers of the atoms of the ligand. From these docking structures,it was found that that several hydrogen bonds were formed and highcomplementarities with the enzyme cavity were achieved by selectedligand-candidates. Therefore, these ligand candidate compounds arepromising lead compounds as inhibitors against dihydrofolate reductase.

INDUSTRIAL APPLICABILITY

[0111] According to the method of the present invention, ligandcompounds to the target biopolymer whose structure is already known, canbe identified from a three-dimensional structure database in a shorttime, while considering conformational flexibility, thereby extremelyreliable searching results can be easily obtained. The method of thepresent invention is thus useful in the field of medical and biologicalsciences for the creation of lead compounds having activities asinhibitors, agonists, or antagonist to biopolymers.

What is claimed is:
 1. A method of searching one or more ligand compoundto a target biopolymer from a three-dimensional structure database,which comprises the steps of: (i) the first step of assigninghydrogen-bonding category numbers, information for calculatingforce-field energy, and information for generating conformations, to twoor more trial compounds, in addition to atomic three-dimensionalcoordinates thereof; (ii) the second step of preparing physicochemicalinformation about a ligand-binding region and one or more dummy atomsbased on the three-dimensional atomic coordinates of the targetbiopolymer; (iii) the third step of estimating the most stable dockingstructure through structural optimization, wherein said step furthercomprises the steps of generating possible docking structures by dockinga trial compound to the biopolymer while varying conformations of thetrial compound, and evaluating interaction energy between the targetbiopolymer and the trial compound, based on the hydrogen-bondingcategory number, the information for calculating force-field energy, andthe information for generating conformations assigned to the trialcompound in addition to the three-dimensional coordinates according tothe step (i) and based on the physicochemical information about theligand-binding region prepared in the step (ii): (iv) the fourth step ofdeciding whether or not the trial compound should be adopted as a ligandcandidate compound based on given criteria including the interactionenergy values between the target biopolymer and the trial compound inthe most stable docking structure found according to the step (iii); and(v) the fifth step of repeating the step (iii) and the step (iv) for allof the trial compounds.
 2. The method according to claim 1, wherein thethird step further comprises the steps of: (1) step (a) of searchinglikely binding modes between the biopolymer and the trial compound bypreparing all of the combinatorial correspondences betweenhydrogen-bonding heteroatoms in the trial compound and dummy atomsplaced at the positions of heteroatoms that can be hydrogen-bondingpartners of hydrogen-bonding functional groups in the biopolymer; (2)step (b) of simultaneously estimating possible hydrogen-bonding schemesbetween the biopolymer and the trial compound and conformations ofhydrogen-bonding part of the trial compound by comparing the distancesbetween dummy atoms with the distances between the correspondedhydrogen-bonding heteroatoms while systematically changing theconformations of the trial compound; and (3) step (c) of obtaining thestructure of a docking structure comprising the biopolymer and the trialcompound by converting the atomic coordinates of the trial compound tothe coordinate system of the biopolymer for each hydrogen-bonding schemeand conformation obtained in the step (b) based on the correspondencesbetween the hydrogen-bonding heteroatoms in the trial compound and thedummy atoms.
 3. The method according to claim 1, wherein the third stepfurther comprises the steps of: (1) step (a) of searching likely bindingmodes between the biopolymer and the trial compound by preparing all ofthe combinatorial correspondences between hydrogen-bonding heteroatomsin a partial structure of the trial compound and the dummy atoms placedat the positions of heteroatoms that can be hydrogen-bonding partners ofhydrogen-bonding functional groups in the biopolymer; (2) step (b) ofsimultaneously estimating hydrogen-bonding schemes between thebiopolymer and the trial compound and conformations of the partialstructure of the trial compound by comparing the distances between dummyatoms with the distances between the corresponded hydrogen-bondingheteroatoms while systematically changing the conformations of the trialcompound; (3) step (c) of preserving correspondences of the dummy atomsand the hydrogen-bonding heteroatoms that provide impossiblehydrogen-bonding schemes in the partial structure of the trial compound,and hydrogen-bonding heteroatoms that cannot form any hydrogen-bond withthe dummy atoms based on the hydrogen-bonding schemes and conformationsobtained in the step (b); (4) step (d) of searching likely binding modesbetween the biopolymer and the trial compound by preparing all of thecombinatorial correspondences between dummy atoms and hydrogen-bondingheteroatoms in a the whole structure of the trial compound, excludingthe correspondences containing the hydrogen-bonding heteroatomspreserved in the step (c) and the correspondences containing the dummyatoms and hydrogen-bonding heteroatoms preserved in the step (c); (5)step (e) of simultaneously estimating possible hydrogen-bonding schemesbetween the biopolymer and the trial compound and conformations ofhydrogen-bonding part of the trial compound by comparing the distancesbetween dummy atoms with the distances between correspondedhydrogen-bonding heteroatoms while systematically changing theconformations of the trial compound; and (6) step (f) of obtaining thestructure of a docking structure comprising the biopolymer and the trialcompound by converting the atomic coordinates of the trial compound tothe coordinate system of the biopolymer for each hydrogen-bonding schemeand conformations obtained in the step (e) based on the correspondencesbetween the hydrogen-bonding heteroatoms in the trial compound and thedummy atoms.
 4. The method according to claim 1, wherein the third stepfurther comprises the steps of: (1) step (a) of searching likely bindingmodes between the biopolymer and the trial compound by preparing all ofthe combinatorial correspondences between hydrogen-bonding heteroatomsin the trial compound and dummy atoms placed at the positions ofheteroatoms that can be hydrogen-bonding partners of hydrogen-bondingfunctional groups in the biopolymer; (2) step (b) of simultaneouslyestimating possible hydrogen-bonding schemes between the biopolymer andthe trial compound and conformations of hydrogen-bonding part of thetrial compound by comparing the distances between dummy atoms with thedistances between hydrogen-bonding heteroatoms while systematicallychanging the conformations of the trial compound; (3) step (c) ofoptimizing the conformations of the trial compound so that the positionsof the dummy atoms and the hydrogen-bonding heteroatoms in the trialcompound can be in accord with each other while retaining thehydrogen-bonding scheme obtained in the step (b), and then excluding theconformations of the trial compound having high intramolecular energies;(4) step (d) of obtaining the structure of a complex comprising thebiopolymer and the trial compound by converting the entire atomiccoordinates of the trial compound to the coordinate system of thebiopolymer for each conformation not excluded in the step (c) based onthe corresponding relationships between the hydrogen-bonding heteroatomsin the trial compound and the dummy atoms; (5) step (e) of excludingfrom docking structures of hydrogen-bonding parts obtained in the step(d) the docking structures in which intramolecular energies of thehydrogen-bonding parts of the trial compound and intermolecular energiesbetween the biopolymer and the hydrogen-bonding parts of the trialcompound are high, and then carrying out structure optimizations of theremaining docking structures; (6) step (f) of obtaining new dockingstructures comprising the whole trial compound by generating theconformations of non-hydrogen-bonding parts of the trial compound foreach docking structure obtained in the step (e); and (7) step (g) ofexcluding from the docking structures obtained in the step (f) thedocking structures in which intramolecular energies of the wholestructure of the trial compound and intermolecular energies between thebiopolymer and the trial compound are high, and then carrying outstructure optimizations of the remaining docking structures.
 5. A ligandcompound for a biopolymer which is retrieved by a method according toany one of claims 1 to
 4. 6. A database containing information ofthree-dimensional coordinates of trial compounds, which comprisesinformation selected from the group consisting of atomic type numbers,hydrogen bonding category numbers, atomic charge, and bond rotationschemes, wherein the information being appended to the three-dimensionalcoordinates of the trial compounds, and which is optionally appendedwith information of three-dimensional coordinates of hydrogen atoms. 7.A database according to claim 6 which is used for retrieving ligandcompounds for biopolymers.